Jump prediction.

A jump prediction circuit predicts the outcome of a conditional jump instruction and is of particular use in a
pipelined processor. An initial guess is formed, based on the value of the jump parameter in the instruction. A
random-access memory stores the history of the outcome of previously executed jump instructions and is used,
when valid, to correct the initial guess to produce a final jump prediction.

Jump Prediction.



Background to the invention. 

This invention relates to apparatus for predicting the outcome of jump (branch) instructions.
The invention is particularly, although not exclusively, concerned with jump prediction in a pipelined
data processing system.

In a data processing system, instructions are normally executed sequentially. However, a jump
instruction can specify that a jump is to be made out of this normal sequence. The jump instruction may be
unconditional, which means that a jump is made whenever the instruction is executed. Alternatively, the
jump instruction may be conditional, which means that a jump is made only if a specified condition (e.g. the
contents of an accumulator register are greater than zero) is satisfied. A jump instruction may be absolute,
which means that a jump is made to a specified absolute address. Alternatively, the jump instruction may
be relative, which means that a jump is made to an address displaced by a specified amount from the
current instruction.

In the case of a pipelined processor, conditional jump instructions present a particular problem. In
general, the actual condition upon which the jump depends will not be available until the instruction
approaches the end of the pipeline. If the condition indicates that a jump is to be made, then all later
instructions that have been started in the pipeline will be invalid and must be abandoned. Clearly, this slows
down the operation of the system.

One way of reducing the problem is to attempt to predict the likely outcome (jump/no jump) of the
conditional jump instruction, and to prefetch the next instruction into the pipeline on the basis of this
prediction. If the predictions are correct then it is not necessary to abandon any subsequent instructions
and so the operation of the pipeline can continue without any hold-ups.

One way of predicting the outcomes of jump instructions is to maintain a table which records the
outcomes of previously executed jump instructions at given memory locations. Whenever a jump instruction
is encountered at one of the given memory locations, the table is accessed to provide a prediction, on
the assumption that the outcome will be the same as last time the instruction was executed. One such
prediction mechanism is described in US Patent Specification No. 4 477 872.

One object of the present invention is to provide an improved apparatus for predicting the outcome of
conditional jump instructions.



Summary of the invention.

According to the invention there is provided data processing apparatus comprising:

(a) means for fetching a series of instructions, including conditional jump instructions,

(b) means for executing the series of instructions, and

(c) prediction means for predicting the outcomes of execution of the conditional jump instructions on
the basis of a stored history of previous occurrences of those instructions,

wherein when a stored history is not available for a particular conditional jump instruction, the prediction
means predicts the outcome of that instruction on the basis of an internal attribute of that instruction.



Brief description of the drawings.

One embodiment of the invention will now be described by way of example with reference to the
accompanying drawings.

Figure 1 is an overall view of a pipelined data processing system, including an instruction scheduler.
Figure 2 shows the instruction scheduler in more detail.
Figure 3 shows a jump prediction circuit forming part of the instruction scheduler.



Description of an embodiment of the invention.

Referring to Figure 1, this shows a pipeline processing system comprising four pipeline units as follows:
an instruction scheduler 10, an upper pipeline unit 11, a data slave 12, and a lower pipeline unit 13. Each of
these pipeline units itself comprises a sub-pipeline, consisting of a number of pipeline stages connected
together in series.

The scheduler 10 prefetches a series of instructions and passes them to the upper pipeline 11.
Normally, the instructions are fetched sequentially, from consecutive memory locations. In the case of
certain conditional jump instructions, the scheduler 10 makes a prediction of the likely outcome of the
instruction, and prefetches the next instruction on that basis, as will be described.

The upper pipeline 11 receives instructions from the scheduler 10 and processes them so as to
generate operand addresses. The addresses are passed to the data slave 12.

The data slave retrieves the operands, if necessary by retrieving them from a main memory (not 
shown). The retrieved operands are passed to the lower pipeline 13. 

The lower pipeline 13 performs arithmetic or logic operations on the operand, as specified by the
instruction, the result being used to update an internal register or written back into the data slave and/or the
main memory. In the case of a conditional jump instruction, the lower pipeline 13 detects whether or not a
specified jump condition has been satisfied, and hence decides whether or not a jump should have been
made. If the scheduler 10 made a wrong prediction for this instruction, then the subsequent instructions in
the pipeline are abandoned, and the lower pipeline 13 supplies a corrected program counter value CPC to
the scheduler, to cause it to fetch the next instruction.

The upper pipeline, data slave, and lower pipeline may all be conventional units and so it is not
necessary to describe them in any further detail. The scheduler will be described in more detail below.



Instruction Format

The instructions may be either in 16-bit (half-word) or 32-bit (full-word) format.

The instructions are fetched from the memory in double-word blocks (i.e. four half-words). Each block
may contain a mixture of 16-bit and 32-bit instructions. The instructions are all aligned with half-word
boundaries in the block, but the 32-bit instructions are not necessarily aligned with the full-word boundaries.
Hence, a 16-bit instruction, or the first half of a 32-bit instruction, can lie in any half-word location.

Full details of the instruction format may be obtained from 'The ICL 2900 Series' by J.K. Buckle,
Macmillan Press Ltd, 1978.

Each instruction contains a function code F which indicates the operation to be performed by the
instruction. These include the three relative conditional jump function codes JCC (jump on condition code),
JAT (jump on arithmetic condition true) and JAF (jump on arithmetic condition false).

Each instruction also contains a parameter N. In the case of a relative literal jump, this parameter is
interpreted as a displacement value, to be added to the current instruction address to produce the jump
destination address. The displacement may be either positive or negative, so that the jump may either be
forward or backward.


Scheduler 

Referring now to Figure 2, this shows the instruction scheduler 10 in greater detail.
The scheduler includes a program counter (PC) register 20, which produces an instruction prefetch
address CVSA indicating the address of the next block of instructions to be prefetched.

The contents of register 20 are normally incremented at successive clock beats by means of an adder
circuit 21, and then written back into the register 20, so as to cause blocks of instructions to be prefetched
sequentially. However, if a jump is predicted, a multiplexer 22 is switched, and this causes a predicted jump
destination address PCN to be loaded into the register 20. Alternatively, in the case of a wrongly predicted
jump detected by the lower pipeline 13, a multiplexer 23 is switched so as to load the corrected program
counter value CPC into the register 20. In either case, after the register 20 has been updated by PCN or
CPC, normal sequential addressing continues.
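
The selection of the next prefetch address can be sketched as follows. This is only an illustrative model, not the circuit itself: the function name, the `block_step` parameter and the choice to let a corrected value CPC take priority over a predicted destination PCN when both are asserted are assumptions that the text does not state.

```python
def next_prefetch_address(pc, jump_predicted, pcn, mispredicted, cpc, block_step=1):
    """Model of register 20 with adder 21 and multiplexers 22 and 23.

    block_step (hypothetical) is the address increment per double-word block.
    Giving CPC priority over PCN is an assumption made for this sketch.
    """
    if mispredicted:
        return cpc              # multiplexer 23: corrected program counter
    if jump_predicted:
        return pcn              # multiplexer 22: predicted jump destination
    return pc + block_step      # adder 21: normal sequential prefetch


# Sequential fetch, then a predicted jump, then a correction from the lower pipeline.
addresses = [
    next_prefetch_address(100, False, 0, False, 0),
    next_prefetch_address(101, True, 240, False, 0),
    next_prefetch_address(240, False, 0, True, 102),
]
```

After either kind of reload, the following call with both flags false resumes sequential addressing from the new value.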

The instruction prefetch address CVSA is applied to a code slave store 24, so as to retrieve a 2-word
block (64 bits) of instruction data from the code slave. If the required block is not present in the code slave,
it is retrieved from the main memory (not shown) and loaded into the code slave 24.

Each block of instructions retrieved from the code slave is written into an instruction buffer 25, which
holds six double-word blocks. Instructions are read out of the instruction buffer sequentially by way of a
multiplexer 26, and passed to the upper pipeline unit 11. If the instruction buffer is empty, the multiplexer
26 is switched so as to allow the next instruction from the code slave to by-pass the instruction buffer.

Each block of instructions from the code slave is also written into a jump buffer 27, having three
double-word block locations. The jump buffer is smaller than the instruction buffer because the information in it is
processed more rapidly and retention of code is unnecessary in this buffer. The double-word blocks are
read out sequentially from the jump buffer 27 by way of a multiplexer 29, and passed to a jump prediction
circuit 28. If the jump buffer 27 is empty, the multiplexer 29 is switched so as to cause the next block to be
fed directly to the prediction circuit 28, by-passing the buffer 27. The jump prediction circuit 28 produces
the predicted jump destination address PCN.

It should be noted that the jump buffer normally processes instructions well in advance of the
instruction buffer. This allows a jump to be predicted and code fetched before the instruction buffer is empty
as a result of the jump, or at least reduces the empty time to a minimum.



Jump predict circuit

Referring now to Figure 3, this shows the jump predict circuit 28 in detail.

As mentioned above, the jump prediction circuit receives a double-word block of instructions from the
jump buffer or from the by-pass around the buffer.
The double-word block is applied to a jump detection logic circuit 30, which decodes the contents of
the block, so as to determine the position of each relative jump instruction (if any) in the block. The circuit
30 produces the following output signals.

(1) A two-bit signal LAPCSEL. This indicates the half-word location of the start of the currently
detected relative jump instruction, and is stored in a register 3a.

(2) A two-bit signal NSEL. This indicates the position of the parameter N of the detected relative jump
instruction. NSEL is used to control a multiplexer 35 so as to select the specified parameter N, which is
then loaded into a register 38. The selected N is added to a local program counter value LAPC, along with
LAPCSEL, by means of an adder circuit 31 to produce the predicted jump destination address PCN.

(3) A two-bit signal JTYPE. This is stored in a register 32, and specifies the type of the detected jump
instruction, as follows:

JTYPE   Type
00      Unconditional and literal
01      Conditional and literal
10      Unconditional and non-literal
11      Conditional and non-literal


(4) A two-bit signal JUMPID which further specifies the jump instruction, as follows:

JUMPID  Type
01      JAT
10      JAF
11      JCC
00      Other

The signal JUMPID is stored in a register 37.

The parameter N is decoded in a decoder circuit 39 to produce a GUESS signal which represents a
prediction of the outcome of the jump instruction, based on an internal attribute of the jump instruction. In
this particular example, GUESS is true if N is less than 16. In other words, it is predicted that short forward
jumps and all backward jumps will be made; conversely, longer forward jumps (with N greater than or equal
to 16) will not be made. It has been found that this gives a correct prediction significantly more than 50% of
the time. GUESS is stored in a register 33.
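
As a minimal sketch of this first-level rule, the decoder can be modelled as a single comparison. The function name and the treatment of N as a signed Python integer are assumptions for illustration; the threshold of 16 is the value given for this particular example.

```python
def guess(n, threshold=16):
    """First-level prediction: True means the jump is predicted to be made.

    n is the signed displacement parameter N (negative for backward jumps).
    """
    return n < threshold


# Backward jump, short forward jump, and longer forward jumps.
samples = {n: guess(n) for n in (-40, 0, 15, 16, 200)}
```

Every backward jump has a negative N and so falls below the threshold, which is why the rule favours loop-closing jumps.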

The jump prediction circuit also includes a random-access memory (RAM) 40 having 1024 locations.
Each location holds 16 bits as follows:

bits    name
10-15   JINVERT (0-5)
2-9     VALID (0-7)
1       PUBLIC
0       PARITY


JINVERT (0-5) are "guess invert" bits which predict whether or not the GUESS signal produced for a
particular jump instruction is correct. JINVERT (0-2) relate respectively to JAT, JAF and JCC instructions
starting in word 0 of a double-word block, while JINVERT (3-5) relate respectively to JAT, JAF and JCC
instructions starting in word 1 of a double-word block. Thus, for example, if JINVERT (0) is true, this
indicates that a GUESS produced for a JAT instruction starting in word 0 is predicted to be incorrect and
hence must be inverted.
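
The 16-bit layout of a RAM location can be illustrated by unpacking a word into its fields. This sketch rests on two assumptions that the text does not settle: that bit 0 is the least significant bit of the word, and that JINVERT (0) occupies the lowest bit of the 10-15 field. The names `RamEntry` and `unpack_entry` are hypothetical.

```python
from typing import NamedTuple, Tuple


class RamEntry(NamedTuple):
    jinvert: Tuple[bool, ...]  # six guess-invert bits, JINVERT (0-5)
    valid: int                 # eight VALID bits, compared with the process tag
    public: bool               # PUBLIC bit
    parity: bool               # PARITY bit


def unpack_entry(word):
    """Split a 16-bit RAM word into fields (bit 0 assumed least significant)."""
    jinvert_field = (word >> 10) & 0x3F
    jinvert = tuple(bool((jinvert_field >> i) & 1) for i in range(6))
    valid = (word >> 2) & 0xFF
    public = bool((word >> 1) & 1)
    parity = bool(word & 1)
    return RamEntry(jinvert, valid, public, parity)


# JINVERT (0) set, VALID = 0xA5, PUBLIC set, PARITY clear.
entry = unpack_entry((0b000001 << 10) | (0xA5 << 2) | 0b10)
```

If the hardware numbers its bits from the most significant end, the shifts would need to be mirrored; the field widths would stay the same.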

The RAM 40 is addressed by a 10-bit hash address formed from a 29-bit address from a multiplexer
41. In this example, the hash address is formed by taking a predetermined selection of ten out of the 29
bits. However, in other embodiments of the invention it may be formed, for example, by combining pairs of
the address bits in exclusive-OR gates.
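
Both hashing options can be sketched in a few lines. The text does not say which ten bits are selected or how the bits are paired for the exclusive-OR variant, so the bit positions passed to `hash_select` and the pairing used in `hash_xor_pairs` (bit i with bit i + 10) are purely illustrative assumptions, as are the function names.

```python
def hash_select(addr, bit_positions):
    """Form a hash by picking the given bit positions out of addr."""
    h = 0
    for i, pos in enumerate(bit_positions):
        h |= ((addr >> pos) & 1) << i
    return h


def hash_xor_pairs(addr):
    """Form a 10-bit hash by XOR-ing bit i with bit i + 10 (assumed pairing)."""
    low = addr & 0x3FF
    high = (addr >> 10) & 0x3FF
    return low ^ high


# Hypothetical choice: use the ten lowest address bits as the RAM index.
index_a = hash_select(0x1ABCDEF, range(10))
index_b = hash_xor_pairs(0x1ABCDEF)
```

Either function returns a value in the range 0 to 1023, matching the 1024 locations of the RAM.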

The multiplexer 41 normally selects the output LAPC of a register 42. This represents a local copy of
the program counter PC, and indicates the address of the block currently being processed by the jump
prediction circuit. The register 42 is normally incremented, in an adder circuit 43, each time a new block is
processed by the prediction circuit. When the program counter PC is updated to perform a jump, a
multiplexer 44 is switched so as to write the jump destination address (signal CVSA) into the register 42.

The contents of the addressed location of the RAM 40 are read out and are used as follows.

The six guess invert bits JINVERT (0-5) are written into a register 45. The output of this register is in
turn applied to the input of a multiplexer 46, controlled by the most significant bit of LAPCSEL and the two
bits of JUMPID. The multiplexer 46 thus selects the one of the guess invert bits corresponding to the
current instruction type and position within the two-word block. For example, if JUMPID indicates a JAT
instruction and the most significant bit of LAPCSEL indicates that the instruction starts in word 0, then the
multiplexer 46 selects JINVERT (0).
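
The selection performed by the multiplexer 46 amounts to computing an index from 0 to 5. The sketch below assumes the JUMPID encoding given in the table (01 JAT, 10 JAF, 11 JCC, 00 other) and returns `None` for the "other" code, a convention chosen here for illustration only.

```python
def jinvert_index(lapcsel, jumpid):
    """Return which JINVERT bit (0-5) applies, or None for JUMPID = 00.

    lapcsel: two-bit half-word position; its most significant bit is the word.
    jumpid:  two-bit code, 0b01 = JAT, 0b10 = JAF, 0b11 = JCC.
    """
    if jumpid == 0b00:
        return None
    word = (lapcsel >> 1) & 1          # word 0 or word 1 of the block
    return 3 * word + (jumpid - 1)     # JAT, JAF, JCC map to offsets 0, 1, 2


# A JAT starting in half-word 1 (still word 0) and a JCC starting in half-word 2.
picked = (jinvert_index(0b01, 0b01), jinvert_index(0b10, 0b11))
```

Because only the most significant bit of LAPCSEL is used, two instructions of the same type starting in the two half-words of one word share a JINVERT bit.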

The VALID bits from the addressed location of RAM 40 are compared in a comparator circuit 47 with the
contents of a process tag register 48. The output of the comparator 47 is combined in an OR gate 49 with
the PUBLIC bit from the addressed location of the RAM, to produce a signal RAMHIT which is stored in a
register 50. Thus, it can be seen that RAMHIT is true if the VALID bits match the contents of the process
tag register 48, or if the PUBLIC bit is set.

The process tag register 48 is incremented each time there is a change of process (program) in the
system, and this effectively invalidates all the entries in the RAM 40 relating to different processes, except
for those entries relating to shared public code, which are indicated by the PUBLIC bit.

The RAMHIT signal indicates whether or not the guess invert signals JINVERT (0-5) are to be taken as
valid.
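
The RAMHIT logic and the effect of a process change can be sketched as below. The text says only that the process tag register is incremented; treating it as an 8-bit counter that wraps around, to match the eight VALID bits, is an assumption of this sketch, and the function names are hypothetical.

```python
def ram_hit(valid_bits, public_bit, process_tag):
    """Comparator 47 OR-ed with the PUBLIC bit (gate 49)."""
    return valid_bits == process_tag or public_bit


def next_process_tag(process_tag):
    """Advance the tag on a process change (8-bit wrap-around is assumed)."""
    return (process_tag + 1) & 0xFF


tag = 7
hit_before = ram_hit(valid_bits=7, public_bit=False, process_tag=tag)
tag = next_process_tag(tag)          # change of process
hit_after = ram_hit(valid_bits=7, public_bit=False, process_tag=tag)
hit_public = ram_hit(valid_bits=7, public_bit=True, process_tag=tag)
```

A single increment invalidates every private entry at once, without any need to clear the RAM.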

RAMHIT controls a multiplexer 51. When RAMHIT is true, the multiplexer 51 selects the output of an
exclusive-OR gate 52 which combines the GUESS signal with the selected JINVERT bit from multiplexer
46. When RAMHIT is false, the multiplexer selects GUESS. The output of the multiplexer 51 is a
JUMPPREDICT signal which indicates the predicted outcome of the jump instruction.
Thus, it can be seen that if RAMHIT indicates that JINVERT is not valid, then the GUESS signal is used
directly as the final prediction JUMPPREDICT. If, on the other hand, RAMHIT indicates that JINVERT is
valid, then the final prediction JUMPPREDICT is formed by selectively inverting GUESS, according to the
value of the appropriate JINVERT bit.
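
The combination of GUESS, RAMHIT and the selected JINVERT bit can be written out as a small truth function. The function name is hypothetical; the logic follows the description of gate 52 and multiplexer 51.

```python
def jump_predict(guess, ramhit, jinvert_bit):
    """Final prediction: GUESS, inverted by JINVERT only when RAMHIT is true."""
    if ramhit:
        return guess != jinvert_bit   # exclusive-OR gate 52
    return guess                      # multiplexer 51 passes GUESS unchanged


# Enumerate all eight input combinations as (guess, ramhit, jinvert) -> prediction.
table = {
    (g, h, j): jump_predict(g, h, j)
    for g in (False, True)
    for h in (False, True)
    for j in (False, True)
}
```

The table shows that the stored bit has no effect at all when RAMHIT is false, which is what allows stale entries to remain in the RAM harmlessly.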

JTYPE is then used to determine the final action, as follows:

(1) If JTYPE = 00, a jump is made to the predicted destination PCN.

(2) If JTYPE = 01, a jump is made to PCN if JUMPPREDICT is true.

(3) If JTYPE = 10 or 11, no jump is made.
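
The three cases above can be expressed as a single decision function. This is a sketch only; the function name is hypothetical and the JTYPE codes are those given in the JTYPE table.

```python
def take_predicted_jump(jtype, jumppredict):
    """Decide whether the scheduler jumps to PCN for a detected jump."""
    if jtype == 0b00:        # unconditional and literal: always jump
        return True
    if jtype == 0b01:        # conditional and literal: follow the prediction
        return jumppredict
    return False             # non-literal (10 or 11): no jump is made


decisions = [take_predicted_jump(t, p) for t in (0b00, 0b01, 0b10, 0b11) for p in (False, True)]
```

Only the conditional literal case depends on JUMPPREDICT; non-literal jumps are never followed at this stage.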

The way in which the RAM 40 is updated will now be described.

As indicated above, whenever the lower pipeline 13 detects an incorrectly predicted jump instruction, it
sends a corrected program counter value CPC to the scheduler, so as to return to the correct sequence of
instructions. At the same time, the lower pipeline sends the address JSOURCEPC of the jump instruction in
question to the jump prediction circuit. This is loaded into a register 53 and is then used to address the
RAM 40 by way of the multiplexer 41.
The appropriate one of the JINVERT bits of the currently addressed RAM location, according to the
type and position of the jump instruction, is then inverted. For example, if the incorrectly predicted jump
instruction is a JAT starting in word 0, then JINVERT (0) is inverted.

At the same time, if the VALID bits in the addressed location match the process tag in register 48, the
remaining JINVERT bits are written back unchanged into the addressed location of the RAM. If, on the other
hand, the VALID bits do not match the process tag, then these remaining JINVERT bits are all reset, the
current process tag value is written into the VALID bits, and the PUBLIC bit is updated.
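
The update performed after a misprediction can be sketched as follows. The sketch reads the text literally: the selected JINVERT bit is inverted in both cases, and only the remaining bits depend on the VALID comparison. The text does not say how the new PUBLIC value is derived, nor whether PUBLIC changes when the tags match, so the `code_is_public` parameter and the choice to leave PUBLIC unchanged on a match are assumptions.

```python
def update_on_mispredict(jinvert, valid, public, process_tag, index, code_is_public):
    """Return the new (jinvert, valid, public) fields of the addressed location.

    jinvert: tuple of six booleans; index: which bit belongs to the jump.
    code_is_public (hypothetical): new PUBLIC value when the entry is re-tagged.
    """
    flipped = not jinvert[index]          # selected bit is inverted
    if valid == process_tag:
        new_jinvert = list(jinvert)       # remaining bits written back unchanged
        new_public = public               # assumed unchanged on a tag match
    else:
        new_jinvert = [False] * 6         # remaining bits are all reset
        valid = process_tag               # entry now belongs to this process
        new_public = code_is_public
    new_jinvert[index] = flipped
    return tuple(new_jinvert), valid, new_public


# Entry owned by the current process (tag 7): only JINVERT (0) changes.
same = update_on_mispredict((False, True, False, False, False, True), 7, False, 7, 0, True)
# Entry left over from another process (tag 3): the other bits are cleared.
stale = update_on_mispredict((False,) * 6, 3, False, 7, 2, False)
```

No write takes place for a correctly predicted jump, so a call to this function models the only kind of write access the RAM receives.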

In summary, it can be seen that the system uses a two-level jump prediction mechanism. The first level
of prediction provides the GUESS signal and is based on an internal attribute of the jump instruction itself.
The second level of prediction is provided by the RAM 40, and is based on the history of the outcome of
the jump instruction the last time it was performed. Specifically, the second level of prediction takes the
form of an indication of whether the initial prediction provided by the GUESS signal was correct.

If the prediction provided by RAM 40 is not valid (RAMHIT false), for example because this particular
jump instruction has not been encountered before, then the initial GUESS is used directly as a final jump
prediction. However, if the RAM prediction is valid, then it is used to correct the initial GUESS to produce
the final prediction.

It has been found that this two-level prediction mechanism is particularly advantageous in that it allows
a prediction to be made even if there is no previously recorded history of the outcome of a particular
instruction, while allowing a more accurate prediction when the history is available.
Taking the RAM contents to signify whether or not to invert the initial guess, rather than simply as a
prediction to be substituted for the initial guess, allows several predictions to be stored in each line of the
RAM.

Moreover, the fact that the RAM is written to only when a jump has been incorrectly predicted greatly
reduces the number of write accesses to the RAM, and hence makes more effective use of the storage
capacity of the RAM. It also reduces the possibility of "thrashing", i.e. the situation where two different
instructions alternately set and reset the same JINVERT bit.

It should be noted that because the RAM 40 is hash-addressed, several different instructions may map
onto the same location of the RAM. As a result, some of the JINVERT outputs from the RAM may be
spurious, in that they may relate to a different instruction from that which is currently being processed by
the prediction circuit. The possibility of such spurious outputs being used is reduced by the use of the
VALID bits and the process tags, which invalidate locations in the RAM which relate to processes other than
the current one. The possibility of spurious outputs is also reduced by having separate JINVERT bits for
each different type of conditional jump instruction in different word locations of a two-word block.

Claims

1. Data processing apparatus comprising:
(a) means (10) for fetching a series of instructions, including conditional jump instructions,
(b) means (11, 12, 13) for executing the series of instructions, and
(c) prediction means (28) for predicting the outcomes of execution of the conditional jump instructions on
the basis of a stored history of previous occurrences of those instructions,
characterised in that when a stored history is not available for a particular conditional jump instruction, the
prediction means (28) predicts the outcome of that instruction on the basis of an internal attribute (N) of that
instruction.

2. Data processing apparatus comprising
(a) means for fetching a series of instructions, including conditional jump instructions,
(b) means for executing the series of instructions, and
(c) a memory (40) for storing a history of execution of the conditional jump instructions, characterised
by
(d) prediction means (39) for making a prediction of the outcome of a conditional jump instruction on
the basis of an internal attribute (N) of that instruction, and
(e) prediction correction means (46, 52) operable when the history of execution of a particular jump
instruction is available in the memory (40), for correcting said prediction in accordance with the stored
history of execution of that instruction.

3. Apparatus according to claim 2 wherein the memory (40) holds a plurality of bits, each of which
indicates whether the jump prediction for a particular jump instruction was correct on a previous occasion
when said jump instruction was executed.

4. Apparatus according to claim 2 or 3 wherein said internal attribute (N) is a jump parameter.

5. Apparatus according to claim 4 wherein said prediction means (39) predicts a successful jump if said
jump parameter (N) is less than a predetermined value.

6. Apparatus according to any one of claims 2-5 wherein said memory (40) is addressed by a hash
address derived from the address of the jump instruction currently being processed by the prediction
means.

7. Apparatus according to any one of claims 2-6 wherein said memory (40) is updated only when an
incorrect jump prediction is detected.

8. Apparatus according to any one of claims 2-7 wherein each location of said memory (40) holds a
plurality of jump history indications, one for each of a plurality of different types of jump instruction, the
apparatus also including means (47) for selecting one of the jump history indications according to the type
of current jump instruction.

9. Apparatus according to any one of claims 2-8 wherein each location of said memory holds a validity
field which is set to the value of a process tag when the location is updated, the apparatus also including
means for comparing the validity field of the currently addressed location with said process tag to produce
a validity signal when they match, said validity signal indicating that said stored history is available.

Fig. 1

Fig. 3